To make proper use of the SportLogiq spatiotemporal data, we first have to understand how events are classified. We’ll start by exploring the distribution of event names.
table(Events$name)
##
## assist block carry
## 5908 112667 130323
## check controlledentry controlledentryagainst
## 31314 48205 44291
## dumpin dumpinagainst dumpout
## 43528 37632 42505
## faceoff goal goalagainst
## 71408 3530 3370
## icing lpr offside
## 4472 457433 3033
## pass penalty puckprotection
## 497660 5044 86031
## reception save shot
## 363965 33867 68317
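The `table()` output above is ordered alphabetically, which makes it hard to see which events dominate. A minimal sketch of the same tabulation sorted by frequency, with each event's share of the total, is shown here. It assumes dplyr is loaded and uses a small made-up data frame (`events_toy`, a hypothetical stand-in) instead of `Events`.

```r
library(dplyr)

# Toy stand-in for Events (made-up rows, not SportLogiq data)
events_toy <- tibble(name = c('pass', 'pass', 'lpr', 'reception', 'pass', 'lpr'))

name_counts <- events_toy %>%
  count(name, sort = TRUE) %>%   # one row per event name, most frequent first
  mutate(share = n / sum(n))     # fraction of all events

print(name_counts)
```

Swapping `events_toy` for `Events` should give the same counts as `table(Events$name)`, just reordered.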
assists <- filter(Events, name == 'assist')
table(assists$type)
##
## first second
## 3268 2640
This is the standard primary/secondary assist breakdown.
blocks <- filter(Events, name == 'block')
table(blocks$type)
##
## blueline pass shot
## 10733 84198 17736
a <- name_type_function(blocks, 'blueline')
b <- name_type_function(blocks, 'pass')
c <- name_type_function(blocks, 'shot')
grid.arrange(a,b,c, ncol = 3)
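To investigate what a blueline block is, we tabulate the event in the row immediately before each one. The large majority are preceded by a dumpout or a pass.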
a <- grep('blueline', Events$type) - 1
table(Events[a,'name'])
##
## assist block check dumpin dumpinagainst
## 11 43 1 5 3
## dumpout faceoff icing lpr pass
## 6660 8 6 430 3276
## penalty puckprotection reception save shot
## 19 60 188 6 17
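The index arithmetic above (`grep(...) - 1`) relies on the previous row being the previous event in the same game. A slightly more defensive sketch uses `lag()` within each game. The `gameid` column name is an assumption (it does not appear in this section), as is the requirement that rows are already in chronological order; the data is made up.

```r
library(dplyr)

# Toy stand-in for Events. `gameid` is a hypothetical column name, and rows are
# assumed to be in chronological order within each game.
events_toy <- tibble(
  gameid = c(1, 1, 1, 2, 2),
  name   = c('dumpout', 'block', 'pass', 'block', 'lpr'),
  type   = c('boards', 'blueline', 'north', 'blueline', 'none')
)

prev_events <- events_toy %>%
  group_by(gameid) %>%
  mutate(prev_name = lag(name)) %>%   # NA for the first event of each game
  ungroup() %>%
  filter(name == 'block', type == 'blueline')

table(prev_events$prev_name, useNA = 'ifany')
```

Grouping by game keeps the first event of one game from being paired with the last event of the previous one.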
Carry <- filter(Events, name == 'carry')
table(Carry$type)
##
## none
## 130323
There is no type breakdown for Carry. The events appear to be recorded when the puck is carried across either blueline or the red line (with a few strange outliers). I’m not sure what the spread of x-coordinates around each of those lines means.
Carry <- mutate(Carry, location = case_when(xadjcoord > -30 & xadjcoord < -20 ~ 'Defensive Blueline',
xadjcoord > -5 & xadjcoord < 5 ~ 'Redline',
xadjcoord > 20 & xadjcoord < 30 ~ 'Offensive Blueline',
TRUE ~ 'Other'))
ggplot(Carry, aes(x = xadjcoord, y = yadjcoord)) + geom_point() +
geom_vline(xintercept = 25, color = 'blue', size = 2) + geom_vline(xintercept = 0, color = 'red', size = 2) +
geom_vline(xintercept = -25, color = 'blue', size = 2) + xlim(-100,100) + theme_classic()
table(Carry$location)
##
## Defensive Blueline Offensive Blueline              Other
##              45653              36160                  4
##            Redline
##              48506
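To put numbers on the spread of x-coordinates around each line, one option is to summarise `xadjcoord` within each location bin. This is a minimal sketch on made-up coordinates (`carry_toy` is a hypothetical stand-in for `Carry`), reusing the same `case_when()` thresholds as above.

```r
library(dplyr)

# Toy stand-in for Carry (made-up coordinates)
carry_toy <- tibble(xadjcoord = c(-27.5, -24, 0.5, 3, 22, 26, 60))

carry_summary <- carry_toy %>%
  mutate(location = case_when(
    xadjcoord > -30 & xadjcoord < -20 ~ 'Defensive Blueline',
    xadjcoord > -5  & xadjcoord < 5   ~ 'Redline',
    xadjcoord > 20  & xadjcoord < 30  ~ 'Offensive Blueline',
    TRUE                              ~ 'Other'
  )) %>%
  group_by(location) %>%
  summarise(
    n      = n(),              # events in the bin
    min_x  = min(xadjcoord),   # extent of the spread
    max_x  = max(xadjcoord),
    mean_x = mean(xadjcoord)
  )

print(carry_summary)
```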
Check <- filter(Events, name == 'check')
table(Check$type)
##
## body stick
## 7595 23719
Body and Stick check breakdown.
ControlledEntry <- filter(Events, name == 'controlledentry')
table(ControlledEntry$type)
##
## carry
## 13145
## carrywithplay
## 7492
## carrywithplaywithshotonnet
## 5025
## carrywithplaywithshotonnetandslotshot
## 6082
## carrywithplaywithslotshot
## 1599
## carrywithshotonnet
## 1078
## carrywithshotonnetandslotshot
## 1237
## carrywithslotshot
## 487
## pass
## 7207
## passwithplay
## 1455
## passwithplaywithshotonnet
## 1007
## passwithplaywithshotonnetandslotshot
## 1346
## passwithplaywithslotshot
## 344
## passwithshotonnet
## 268
## passwithshotonnetandslotshot
## 309
## passwithslotshot
## 124
This is a really nice breakdown: it does the legwork of recording what happened during the possession tied to the controlled entry.
ControlledEntry <- mutate(ControlledEntry, Carry_Pass = if_else(grepl('carry', type),
'Carry',
'Pass'))
table(ControlledEntry$Carry_Pass)
##
## Carry Pass
## 36145 12060
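Because the type strings encode several outcomes at once, they can be split into separate indicator columns. This is a minimal sketch on a few of the type values listed above; the column names `entry_kind`, `with_play`, `shot_on_net` and `slot_shot` are illustrative choices, not SportLogiq fields.

```r
library(dplyr)

# Type strings copied from the table above
entry_toy <- tibble(type = c(
  'carry', 'carrywithplay', 'carrywithshotonnet',
  'passwithplaywithshotonnetandslotshot', 'passwithslotshot'
))

entry_flags <- entry_toy %>%
  mutate(
    entry_kind  = if_else(grepl('^carry', type), 'Carry', 'Pass'),
    with_play   = grepl('withplay', type),    # a play followed the entry
    shot_on_net = grepl('shotonnet', type),   # possession produced a shot on net
    slot_shot   = grepl('slotshot', type)     # possession produced a slot shot
  )

print(entry_flags)
```

With logical columns, questions such as the share of carry entries that lead to a slot shot reduce to a `group_by()` and a `mean()`.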
We also see that the Carry events at the offensive blueline from the section above line up with the Controlled Entry events classified as a carry: 36160 versus 36145, a difference of 15 that I have not investigated.
It might be worth digging deeper into how a pass entry is defined in terms of distance from the offensive blueline (or asking SportLogiq).
ControlledEntryAgainst <- filter(Events, name == 'controlledentryagainst')
table(ControlledEntryAgainst$type)
##
## 1on0 1on1 1on2 2on1 2on2 2on3 3on1 3on2 3on3
## 683 3268 6122 1621 7200 7096 172 3191 14938
This is another really nice breakdown for analysis, as we can see what the offensive vs defensive numbers were like.
One thing we might want clarified is whether this event is assigned only to the defender closest to the controlled entry. For example, there are 3191 occurrences of 3on2; since 3191 is odd, the event cannot have been assigned to both defenders on every such entry.
ControlledEntryAgainst <- mutate(ControlledEntryAgainst,
OffenseNumber = substr(type, 1, 1),
DefenseNumber = substr(type, 4, 4))
ggplot(ControlledEntryAgainst, aes(x = xadjcoord, y = yadjcoord)) +
geom_point() + xlim(-100,100) +
geom_vline(xintercept = 25, color = 'blue', size = 2) + geom_vline(xintercept = 0, color = 'red', size = 2) +
geom_vline(xintercept = -25, color = 'blue', size = 2) +
labs(title = 'Location of ControlledEntryAgainst events broken down by the number of defenders') +
theme_classic() + theme(plot.title = element_text(hjust=0.5)) + facet_wrap(~DefenseNumber, ncol = 2)
From this, it is hard to tell how these events get assigned. In every subgraph, some of the ControlledEntryAgainst events are assigned to the goalie. We might want to look at a graph that links each entry event with its corresponding entry-against event. However, it is very clear that these events are assigned to players in front of the puck, as there are almost no ControlledEntryAgainst events outside the defensive zone.
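The `substr()` calls above leave the attacker and defender counts as character values. Converting them to integers makes it easy to label each entry by manpower situation. This is a minimal sketch on a few of the type strings from the table; the `situation` labels are illustrative, not SportLogiq terminology.

```r
library(dplyr)

# Type strings copied from the table above
cea_toy <- tibble(type = c('1on0', '2on1', '3on3', '1on2'))

cea_toy <- cea_toy %>%
  mutate(
    OffenseNumber = as.integer(substr(type, 1, 1)),   # attackers
    DefenseNumber = as.integer(substr(type, 4, 4)),   # defenders
    situation = case_when(
      OffenseNumber > DefenseNumber  ~ 'odd-man advantage',
      OffenseNumber == DefenseNumber ~ 'even',
      TRUE                           ~ 'outnumbered'
    )
  )

print(cea_toy)
```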
DumpIn <- filter(Events, name == 'dumpin')
table(DumpIn$type)
##
## chip dump
## 5344 38184
The chip/dump classification is likely defined by how far the puck travelled into the zone, but I am unsure.
DumpInAgainst <- filter(Events, name == 'dumpinagainst')
table(DumpInAgainst$type)
##
## none
## 37632
No breakdown for dump in against.
DumpOut <- filter(Events, name == 'dumpout')
table(DumpOut$type)
##
## boards flip ice
## 21472 6570 14463
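We see three variants of a dump-out: boards (likely using the boards), flip (likely flipping the puck into the air above the defenseman along the offensive blueline), and ice (which likely means they iced the puck, although I am unable to verify this). Note that there are 14463 ‘ice’ dump-outs but only 4472 icing events, so at most a fraction of them can have resulted in an icing call.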
Faceoff <- filter(Events, name == 'faceoff')
table(Faceoff$type)
##
## none recovered
## 37448 17447
## recoveredwithentry recoveredwithexit
## 5954 6217
## recoveredwithshotonnet recoveredwithshotonnetandslotshot
## 1834 1478
## recoveredwithslotshot
## 1030
In a similar fashion to entry events above, we get a nice breakdown based on what happens following the faceoff.
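As a quick piece of arithmetic on the table above, the share of faceoff events whose type begins with `recovered` can be computed directly from the printed counts. This is only a sketch: what `none` versus `recovered` means for the player the event is assigned to is an assumption that would need confirming.

```r
# Counts copied from the table above
faceoff_counts <- c(
  none = 37448, recovered = 17447, recoveredwithentry = 5954,
  recoveredwithexit = 6217, recoveredwithshotonnet = 1834,
  recoveredwithshotonnetandslotshot = 1478, recoveredwithslotshot = 1030
)

# Flag every type that starts with 'recovered'
recovered <- grepl('^recovered', names(faceoff_counts))

recovered_share <- sum(faceoff_counts[recovered]) / sum(faceoff_counts)
print(round(recovered_share, 3))
```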
Goal <- filter(Events, name == 'goal')
table(Goal$type)
##
## none
## 3530
No breakdown for goals, which is expected.
GoalAgainst <- filter(Events, name == 'goalagainst')
table(GoalAgainst$type)
##
## none
## 3370
GoalAgainst <- left_join(GoalAgainst, Players, by = 'playerid')
table(GoalAgainst$primaryPosition)
##
## D F G
## 0 0 3370
ggplot(GoalAgainst, aes(x = xadjcoord, y = yadjcoord)) + geom_point() + xlim(-100,100) +
geom_vline(xintercept = 25, color = 'blue', size = 2) + geom_vline(xintercept = 0, color = 'red', size = 2) +
geom_vline(xintercept = -25, color = 'blue', size = 2) + theme_classic()
My initial thought was that these events would only be tied to goalies (and would indicate the location of the goalie). They are indeed all tied to goalies, but the coordinates appear to indicate the location of the shot that scored rather than the location of the goalie. It looks like 5 goals were scored from outside the offensive zone (ouch).
Icing <- filter(Events, name == 'icing')
table(Icing$type)
##
## none
## 4472
ggplot(Icing, aes(x = xadjcoord, y = yadjcoord)) + geom_point() + xlim(-100,100) +
geom_vline(xintercept = 25, color = 'blue', size = 2) + geom_vline(xintercept = 0, color = 'red', size = 2) +
geom_vline(xintercept = -25, color = 'blue', size = 2) + theme_classic()
We see some icing events on the offensive side of the red line, which doesn’t make a lot of sense; however, there are fewer than 10, so it doesn’t seem like a big deal. There is an interesting lack of icing events along the defensive blueline. It is also really interesting to see the concentration of icings along the boards (which makes sense).
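The number of icings recorded on the offensive side of the red line can be counted rather than read off the plot. This is a minimal sketch on made-up coordinates (`icing_toy` is a hypothetical stand-in for `Icing`), assuming, as in the plots above, that the red line sits at `xadjcoord = 0`.

```r
library(dplyr)

# Toy stand-in for Icing (made-up coordinates)
icing_toy <- tibble(
  xadjcoord = c(-80, -40, -10, 5, 30),
  yadjcoord = c(-38, 40, -35, 10, 0)
)

icing_summary <- icing_toy %>%
  summarise(
    total          = n(),
    offensive_side = sum(xadjcoord > 0),    # events past the red line
    share          = mean(xadjcoord > 0)    # same thing as a fraction
  )

print(icing_summary)
```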
LPR <- filter(Events, name == 'lpr')
table(LPR$type)
##
## contested faceoff faceoffcontested
## 113712 26519 8952
## hipresopdump hipresopdumpcontested nofore
## 10378 5890 8532
## none nonecontested opdump
## 195030 8106 49870
## opdumpcontested rebound reboundcontested
## 5069 18987 6388
a <- name_type_function(LPR, 'hipresopdump')
b <- name_type_function(LPR, 'nofore')
c <- name_type_function(LPR, 'opdump')
d <- name_type_function(LPR, 'rebound')
grid.arrange(a,b,c,d, ncol = 2)
Offside <- filter(Events, name == 'offside')
table(Offside$type)
##
## none
## 3033
No breakdown for offside. It would have been nice to know whether the offside occurred on a zone entry or on a keep-in, but if necessary we can derive this ourselves.
Pass <- filter(Events, name == 'pass')
table(Pass$type)
##
## d2d d2doffboards eastwest
## 53919 31116 36821
## eastwestoffboards none north
## 1803 509 52066
## northoffboards offboards outlet
## 50128 319 79422
## outletoffboards ozentry ozentryoffboards
## 34610 8810 2058
## ozentrystretch ozentrystretchoffboards rush
## 1217 729 7296
## rushoffboards slot south
## 610 36843 63632
## southoffboards stretch stretchoffboards
## 18245 12412 5095
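Most pass types come in pairs, with and without an `offboards` suffix, so the type can be split into a direction and an off-the-boards flag. This is a minimal sketch on a few of the type strings above. Treating the bare `offboards` type as the off-the-boards counterpart of `none` is an assumption, and the `direction` and `off_boards` column names are illustrative.

```r
library(dplyr)

# Type strings copied from the table above
pass_toy <- tibble(type = c('d2d', 'd2doffboards', 'offboards',
                            'ozentrystretchoffboards', 'slot'))

pass_flags <- pass_toy %>%
  mutate(
    off_boards = grepl('offboards$', type),
    # Strip the suffix; the bare 'offboards' type has no direction left over
    direction  = sub('offboards$', '', type),
    # Assumption: a bare 'offboards' pass corresponds to direction 'none'
    direction  = if_else(direction == '', 'none', direction)
  )

print(pass_flags)
```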
Penalty <- filter(Events, name == 'penalty')
table(Penalty$type)
##
## indiscipline obstruction
## 2625 2419
Breakdown between indiscipline and obstruction penalties. I don’t know how they decide which category a penalty falls into (there doesn’t seem to be any correlation with location on the ice, as shown below). We also don’t get any indication of what the penalty was for.
ggplot(Penalty, aes(x = xadjcoord, y = yadjcoord, color = type)) + geom_point() + xlim(-100,100) +
geom_vline(xintercept = 25, color = 'blue', size = 2) + geom_vline(xintercept = 0, color = 'red', size = 2) +
geom_vline(xintercept = -25, color = 'blue', size = 2) + theme_classic()
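The apparent lack of a relationship between penalty type and location can also be checked numerically by cross-tabulating type against zone. This is a minimal sketch on made-up rows (`penalty_toy` is a hypothetical stand-in for `Penalty`), assuming the bluelines sit at `xadjcoord = -25` and `25` as drawn in the plots.

```r
library(dplyr)

# Toy stand-in for Penalty (made-up coordinates)
penalty_toy <- tibble(
  type      = c('indiscipline', 'obstruction', 'obstruction', 'indiscipline'),
  xadjcoord = c(-60, 10, 70, 80)
)

penalty_toy <- penalty_toy %>%
  mutate(zone = case_when(
    xadjcoord < -25 ~ 'Defensive',
    xadjcoord >  25 ~ 'Offensive',
    TRUE            ~ 'Neutral'
  ))

zone_by_type <- table(penalty_toy$zone, penalty_toy$type)

# Column shares: the zone mix within each penalty type
print(prop.table(zone_by_type, margin = 2))
```

If the zone mix is similar across the two columns on the real data, that would support the impression from the plot.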
PuckProtection <- filter(Events, name == 'puckprotection')
table(PuckProtection$type)
##
## body deke
## 18442 67589
There are roughly 3.7 times as many deke puck protections as body puck protections. It seems that any deke is recorded as a puck protection, with success or failure indicated in the outcome column.
Reception <- filter(Events, name == 'reception')
table(Reception$type)
##
## none ozentry regular
## 5 8603 355357
There are reception events with a ‘failed’ outcome. I don’t know if this means that the player could have controlled the pass but didn’t for some unforced reason, or something else.
Save <- filter(Events, name == 'save')
table(Save$type)
##
## none onfailedblock
## 31413 2454
I assume the ‘onfailedblock’ means that a player attempted to block the shot but failed.
Shot <- filter(Events, name == 'shot')
table(Shot$type)
##
## outside outsideblocked slot slotblocked
## 28016 13301 21781 5219
This includes all shots (missed shots have ‘failed’ for outcome), broken down by whether the shot came from the slot and whether it was blocked.
Shot <- mutate(Shot, slot = if_else(grepl('slot', type),
'slot',
'outside'))
ggplot(Shot, aes(x = xadjcoord, y = yadjcoord, color = slot)) + geom_point() + xlim(-100,100) +
geom_vline(xintercept = 25, color = 'blue', size = 2) + geom_vline(xintercept = 0, color = 'red', size = 2) +
geom_vline(xintercept = -25, color = 'blue', size = 2) + theme_classic()
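The shot type also records whether the shot was blocked, so the same string matching can produce a second flag and a two-way table. This is a minimal sketch on the four type strings from the table above; the `blocked` column name is an illustrative choice.

```r
library(dplyr)

# Type strings copied from the table above (one repeated)
shot_toy <- tibble(type = c('outside', 'outsideblocked', 'slot', 'slotblocked', 'slot'))

shot_flags <- shot_toy %>%
  mutate(
    slot    = if_else(grepl('^slot', type), 'slot', 'outside'),
    blocked = grepl('blocked$', type)   # TRUE for outsideblocked / slotblocked
  )

# Rows: shot location; columns: whether the shot was blocked
print(table(shot_flags$slot, shot_flags$blocked))
```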